GAE Basic

Google Cloud Platform

GAE (Google App Engine)

Google App Engine is a cloud computing platform for developing and hosting web applications in Google-managed data centers.

GCE (Google Compute Engine)

Google Compute Engine delivers virtual machines running in Google’s innovative data centers and worldwide fiber network.

GKE (Google Container Engine)

https://cloud.google.com/containers/

Dataflow

Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation.

BigTable

Cloud Bigtable is Google’s NoSQL Big Data database service. It’s the same database that powers many core Google services, including Search, Analytics, Maps, and Gmail.

BigQuery

BigQuery is Google’s fully managed, petabyte scale, low cost enterprise data warehouse for analytics.

Others

  • PubSub
  • Datastore
  • Storage
  • Monitoring

What is a typical BigData project (web+analysis)

3. GAE Basic-2019-4-28-21-11-47.png

  • Service (GAE/GKE) to receive user generated contents
  • Datastore (BigTable/Spanner) to save contents
  • Index (ElasticSearch) to query contents in runtime
  • Service (GAE/GKE) to serve contents in runtime
  • Dataflow (MapReduce) to dump data into BigQuery
  • BigQuery (SQL) to perform further analysis

GKE is more like a simple GCE

Google App Engine Flex

App Engine flexible environment automatically scales your app up and down while balancing the load. Microservices, authorization, SQL and NoSQL databases, traffic splitting, logging.

Step 3.1.1 In your Intellij, create a new file called app.yaml

The content has just two lines.

GAE 通过flex实现负载均衡

1
2
runtime: go
env: flex

Recommended: upload your two files (main.go and app.yaml) into github.

Worst cast: just copy and paste in terminals.

More readings

Setup cloud terminal.

Question: Why use cloud terminal?
Answer: Such that we do not rely on the local environment (Windows vs. Mac).

Click ‘Activate Google Cloud Shell’. Wait until the init is done

Then you will see a vm available for you to play. This is similar to the EC2 vm.

Install go libraries in gcloud terminal.

1
2
go get gopkg.in/olivere/elastic.v3
go get github.com/pborman/uuid

git clone https://github.com/Luorinz/Around.git

Question: what’s the purpose of app.yaml?

Answer: to tell google that your program is a go program and use go compiler to run it.

Then enter this line of code to make sure you copy past is done.

1
cat main.go | head -n 5

You’re expected to see this

1
2
3
4
5
package main

import (
elastic "gopkg.in/olivere/elastic.v3"
"fmt"

Enter gcloud app deploy

When asked about site, enter 1, which is us-west2

When asked Y/n, enter Y. The deployment may take more than 10 minutes to finish (next deployment will be faster).

Error: If you have error like this, you may have a wrong ES_URL. Update the value and deploy again.

1
2
3
4
5
Updating service [default]...failed.                                                                                         
ERROR: (gcloud.app.deploy) Error Response: [9]
Application startup error:
+ exec app
panic: no Elasticsearch node available

Open Postman, find the post query from history.

(POST, url=http://xxx.appspot.com/post)

Replace xxx with your real project id

1
2
3
4
5
6
7
8
{
"user":"1111",
"message":"一生必去的100个地方",
"location":{
"lat":37.5,
"lon":-120.1
}
}

And then find the get query from history. (GET, url=http://around-xxxx.appspot.com/search?lat=37.5&lon=-120)
You should get the results back

1
2
3
4
5
6
7
8
9
10
[
{
"user": "1111",
"message": "一生必去的100个地方",
"location": {
"lat": 37.5,
"lon": -120.1
}
}
]

Play with more examples. If you do not have any results back, try to change your lat or lon to make sure it’s within 200km.

Check the logging to see if there is any error message

Choose SATCKDRIVER -> Logging -> Logs

(Optional) Stop Instance to save credit.

(Optional) Start the instance again. In the same page, choose one instance and click START. If the START is disabled, refresh it or wait a minute.

(Optional) How to filter word

1
2
3
4
5
6
7
8
9
10
11
12
func containsFilteredWords(s *string) bool {
filteredWords := []string{
"fuck",
"100",
}
for _, word := range filteredWords {
if strings.Contains(*s, word) {
return true
}
}
return false
}

Update relevant part from your code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
func handlerSearch(w http.ResponseWriter, r *http.Request) {

...

for _, item := range searchResult.Each(reflect.TypeOf(typ)) {
p := item.(Post)
fmt.Printf("Post by %s: %s at lat %v and lon %v\n", p.User, p.Message, p.Location.Lat, p.Location.Lon)
// TODO(vincent): Perform filtering based on keywords such as web spam etc.
if !containsFilteredWords(&p.Message) {
ps = append(ps, p)
}

}

...
}

Related string operations
https://golang.org/pkg/strings/

(Optional) Useful Go information

Error vs Panic

  • Error: normal way to handle error status.
  • Panic: normally used when some “impossible” situation happens with no intend recover.

    • Use “recover” to handle when exiting function.
    • recover is like catch

“defer” clause

  • Preferred way to recycle resources
    • E.g. file handles, network connections, etc.
    • Defer executes after return
    • First come last executes
    • Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
func Foo() {
var f File
if f, err = os.Open(“Somefile”); err != nil {
panic(“Error open file”)
}
defer f.Close()

...

// some other error happens
if err != nil {
panic(“Unexpected error”)
}
}
  • Combined with recover to handle panic
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
func Foo() result string, err error {
type openFileError struct{}
type unexpectedError struct{}

defer func() {
switch p := recover(); p {
case nil:
// no panic
case openFileError{}:
err = fmt.Errorf(“Open file error”)
case unexpectedError{}:
err = fmt.Errorf(“Unexpected error”)
default:
panic(p)
}
}()

var f File
if f, err = os.Open(“Somefile”); err != nil {
panic(openFileError{})
}
defer f.Close()

// Some work...

// Some other error happens
if err != nil {
panic(unexpectedError{})
}

return result, nil
}
  • Go’s multiprocessing facility: Communication Sequential Processes
  • Go routine
    • Go’s concurrency mechanism
    • Lightweight (corresponding to fiber)
      • Growable stacks
      • Scheduling
        • m:n scheduling - m goroutine on n OS threads
        • Scheduler invoked implicitly by language construct
          • E.g. time.Sleep, I/O, etc.
  • Have no identity - nothing like thread id
    • Avoid thread-local storage (TLS, tend to be misused, similar to global variable)
    • Easy go routine migration
  • Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// clock.go
package main

import (
"io"
"log"
"net"
"time"
)

func handleConn(c net.Conn) {
defer c.Close()
for {
_, err := io.WriteString(c, time.Now().Format("15:04:05\n"))
if err != nil {
return // e.g., client disconnected
}
time.Sleep(1 * time.Second)
}
}

func main() {
listener, err := net.Listen("tcp", "localhost:8000")
if err != nil {
log.Fatal(err)
}
for {
conn, err := listener.Accept()
if err != nil {
log.Print(err) // e.g., connection aborted
continue
}
handleConn(conn)
//go handleConn(conn) // handle connections concurrently
}
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// netcat.go
package main

import (
"io"
"log"
"net"
"os"
)

func main() {
conn, err := net.Dial("tcp", "localhost:8000")
if err != nil {
log.Fatal(err)
}
defer conn.Close()
mustCopy(os.Stdout, conn)
}

func mustCopy(dst io.Writer, src io.Reader) {
if _, err := io.Copy(dst, src); err != nil {
log.Fatal(err)
}
}
  • Go channel
    • Communication mechanism for Go routine
    • Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// channel_example.go
package main

import "fmt"

func main() {
messages := make(chan string)

go func() {
for i := 0; i < 5; i++ {
fmt.Println(i)
}
messages <- "done"
}()

msg := <-messages

fmt.Println(msg)
}